Introduction

Project Motivation: Investigate the impact of pandemic restrictions on hockey player development.

In 2020-2021, many hockey leagues had either a shortened season or no season at all due to COVID-19. When a league such as the OHL was shut down, its players had to find other leagues or tournaments to play in, and some were unable to practice with a team at all that season. How did this disruption in training influence a player’s development? To answer this question, we examine data from the 2019-2020, 2020-2021, and 2021-2022 seasons for junior leagues.

Data:

team_name               season     league  gp   g   a  pts   pm
Hamilton Bulldogs       2019-2020  OHL     59   3  14   17    2
Windsor Spitfires       2021-2022  OHL     67  11  18   29   28
Barrie Colts            2021-2022  OHL     39   1   5    6   -3
Mississauga Steelheads  2021-2022  OHL     57   8  35   43   12
Soo Greyhounds          2021-2022  OHL     43   1   4    5   -2
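
The later models use a points-per-game response. As a minimal sketch, assuming each row above is one player-season and that points per game is simply pts divided by gp (the report's ppg_total may be built differently, for example by pooling several seasons), the sample rows can be checked and converted like this. The data frame name `players` and the column `ppg` are illustrative only.

```r
# The five sample rows shown above, entered by hand.
players <- data.frame(
  team_name = c("Hamilton Bulldogs", "Windsor Spitfires", "Barrie Colts",
                "Mississauga Steelheads", "Soo Greyhounds"),
  season = c("2019-2020", "2021-2022", "2021-2022", "2021-2022", "2021-2022"),
  league = "OHL",
  gp  = c(59, 67, 39, 57, 43),
  g   = c(3, 11, 1, 8, 1),
  a   = c(14, 18, 5, 35, 4),
  pts = c(17, 29, 6, 43, 5),
  pm  = c(2, 28, -3, 12, -2)
)

# Sanity check: points should equal goals plus assists in every row.
stopifnot(all(players$pts == players$g + players$a))

# Assumed definition: points per game = points / games played.
players$ppg <- players$pts / players$gp
print(players[, c("team_name", "season", "gp", "pts", "ppg")])
```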

What proportion of players played during the COVID season (2020-2021)?

## [1] "proportion who played is 0.178"
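
A proportion like the one above can be computed as the mean of a logical indicator. The sketch below uses a small made-up data frame, not the project data. The level name `Played` is taken from the `treatmentPlayed` term in the regression output later in this report, while `NotPlayed`, `player_id`, and the ten toy rows are assumptions for illustration.

```r
# Hypothetical toy data: one row per player with a treatment factor.
# "Played" matches the level in the model output; "NotPlayed" is assumed.
players <- data.frame(
  player_id = 1:10,
  treatment = factor(
    c("Played", "NotPlayed", "NotPlayed", "NotPlayed", "Played",
      "NotPlayed", "NotPlayed", "NotPlayed", "NotPlayed", "NotPlayed"),
    levels = c("NotPlayed", "Played")
  )
)

# The mean of a logical vector is the share of TRUE values.
prop_played <- mean(players$treatment == "Played")
print(paste("proportion who played is", round(prop_played, 3)))
```

If the real data have several rows per player, the data would need to be reduced to one row per player first so that each player is counted once.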

EDA (Treatment)

Continuous Age vs PPG

Refit Regression Models

Multiple linear regression (MLR) with interactions that make sense

## 
## Call:
## lm(formula = ppg_total ~ position * plyr_quality + treatment * 
##     drafted + gp_total + plyr_quality * gp_total + plyr_quality * 
##     drafted + age_continuous * plyr_quality + season, data = ohl_filtered)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.56219 -0.10026 -0.01353  0.06968  0.83082 
## 
## Coefficients:
##                               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                  0.6487089  0.2993550   2.167  0.03079 *  
## positionF                    0.1050757  0.0329205   3.192  0.00152 ** 
## plyr_quality                 0.2821307  0.5193237   0.543  0.58723    
## treatmentPlayed              0.0019808  0.0336923   0.059  0.95315    
## draftedTRUE                  0.0859495  0.0444539   1.933  0.05384 .  
## gp_total                     0.0031135  0.0007901   3.940 9.51e-05 ***
## age_continuous              -0.0482747  0.0174768  -2.762  0.00599 ** 
## season2021-2022              0.3427863  0.0337114  10.168  < 2e-16 ***
## positionF:plyr_quality      -0.0270689  0.0968855  -0.279  0.78008    
## treatmentPlayed:draftedTRUE  0.0327579  0.0589740   0.555  0.57887    
## plyr_quality:gp_total        0.0008258  0.0020820   0.397  0.69185    
## plyr_quality:draftedTRUE    -0.0525461  0.0888778  -0.591  0.55469    
## plyr_quality:age_continuous  0.0325024  0.0285984   1.137  0.25638    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.1935 on 425 degrees of freedom
## Multiple R-squared:  0.7396, Adjusted R-squared:  0.7322 
## F-statistic: 100.6 on 12 and 425 DF,  p-value: < 2.2e-16
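
For reference, here is a hedged sketch of how a model with this formula can be fit and how the treatment-related rows can be pulled out of the summary. It runs on simulated data, not on ohl_filtered: the column names follow the model call above, but every value, the baseline levels (`D`, `NotPlayed`, `2019-2020`), and the data-generating equation are assumptions, so the printed numbers carry no meaning for the project.

```r
set.seed(1)
n <- 300

# Simulated stand-in for ohl_filtered; column names follow the model call,
# but all values and the baseline factor levels are assumptions.
sim <- data.frame(
  position       = factor(sample(c("D", "F"), n, replace = TRUE)),
  treatment      = factor(sample(c("NotPlayed", "Played"), n, replace = TRUE,
                                 prob = c(0.8, 0.2))),
  drafted        = sample(c(TRUE, FALSE), n, replace = TRUE),
  gp_total       = sample(20:130, n, replace = TRUE),
  plyr_quality   = runif(n, 0, 1.5),
  age_continuous = runif(n, 16, 20),
  season         = factor(sample(c("2019-2020", "2021-2022"), n, replace = TRUE))
)

# Arbitrary made-up response, truncated at zero like a real PPG value.
sim$ppg_total <- with(sim, pmax(0, 0.1 + 0.1 * (position == "F") +
                                  0.4 * plyr_quality + 0.002 * gp_total +
                                  rnorm(n, sd = 0.15)))

fit <- lm(ppg_total ~ position * plyr_quality + treatment * drafted + gp_total +
            plyr_quality * gp_total + plyr_quality * drafted +
            age_continuous * plyr_quality + season, data = sim)

# Keep only the rows that involve the treatment indicator.
coefs <- summary(fit)$coefficients
print(coefs[grep("^treatment", rownames(coefs)), , drop = FALSE])
```

Because the formula includes `treatment * drafted`, the `treatmentPlayed` row is the estimated effect for undrafted players, and the effect for drafted players is that estimate plus the `treatmentPlayed:draftedTRUE` interaction.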

Transform PPG -> log(1 + PPG)

## 
## Call:
## lm(formula = log(1 + ppg_total) ~ position * plyr_quality + treatment * 
##     drafted + gp_total + plyr_quality * gp_total + plyr_quality * 
##     drafted + age_continuous * plyr_quality + season, data = ohl_filtered)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.32901 -0.05826 -0.00607  0.04456  0.39200 
## 
## Coefficients:
##                              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                  0.183278   0.177310   1.034  0.30188    
## positionF                    0.084160   0.019499   4.316 1.98e-05 ***
## plyr_quality                 0.865576   0.307600   2.814  0.00512 ** 
## treatmentPlayed              0.002898   0.019956   0.145  0.88459    
## draftedTRUE                  0.071914   0.026330   2.731  0.00657 ** 
## gp_total                     0.002811   0.000468   6.007 4.06e-09 ***
## age_continuous              -0.017665   0.010352  -1.707  0.08864 .  
## season2021-2022              0.208512   0.019967  10.443  < 2e-16 ***
## positionF:plyr_quality      -0.077930   0.057386  -1.358  0.17518    
## treatmentPlayed:draftedTRUE  0.020942   0.034931   0.600  0.54915    
## plyr_quality:gp_total       -0.001254   0.001233  -1.017  0.30985    
## plyr_quality:draftedTRUE    -0.096164   0.052643  -1.827  0.06844 .  
## plyr_quality:age_continuous -0.007997   0.016939  -0.472  0.63712    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.1146 on 425 degrees of freedom
## Multiple R-squared:  0.7686, Adjusted R-squared:  0.7621 
## F-statistic: 117.7 on 12 and 425 DF,  p-value: < 2.2e-16

Takeaways:

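  • The log(1 + PPG) model does not clearly improve on the untransformed PPG model and may fit slightly worse. Its higher R-squared (0.7686 vs. 0.7396) is not directly comparable because the response is on a different scale.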

Transform PPG -> sqrt(1 + PPG)

## 
## Call:
## lm(formula = sqrt(1 + ppg_total) ~ position * plyr_quality + 
##     treatment * drafted + gp_total + plyr_quality * gp_total + 
##     plyr_quality * drafted + age_continuous * plyr_quality + 
##     season, data = ohl_filtered)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.209227 -0.037682 -0.005305  0.026890  0.273866 
## 
## Coefficients:
##                               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                  1.1844952  0.1139609  10.394  < 2e-16 ***
## positionF                    0.0472054  0.0125325   3.767 0.000189 ***
## plyr_quality                 0.3270471  0.1977004   1.654 0.098814 .  
## treatmentPlayed              0.0012675  0.0128263   0.099 0.921325    
## draftedTRUE                  0.0395618  0.0169231   2.338 0.019864 *  
## gp_total                     0.0014869  0.0003008   4.943 1.11e-06 ***
## age_continuous              -0.0150411  0.0066532  -2.261 0.024281 *  
## season2021-2022              0.1329864  0.0128335  10.362  < 2e-16 ***
## positionF:plyr_quality      -0.0298963  0.0368832  -0.811 0.418067    
## treatmentPlayed:draftedTRUE  0.0132330  0.0224507   0.589 0.555890    
## plyr_quality:gp_total       -0.0002172  0.0007926  -0.274 0.784167    
## plyr_quality:draftedTRUE    -0.0408806  0.0338347  -1.208 0.227625    
## plyr_quality:age_continuous  0.0038789  0.0108871   0.356 0.721805    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.07366 on 425 degrees of freedom
## Multiple R-squared:  0.756,  Adjusted R-squared:  0.7491 
## F-statistic: 109.7 on 12 and 425 DF,  p-value: < 2.2e-16

Takeaways:

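  • The sqrt(1 + PPG) model does not clearly improve on the untransformed PPG model and may fit slightly worse. Its R-squared (0.756 vs. 0.7396) is not directly comparable because the response is on a different scale.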
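
R-squared and residual standard error are measured on each model's own response scale, so they cannot be compared directly across the PPG, log(1 + PPG), and sqrt(1 + PPG) fits. One possible way to put the three models on equal footing is to back-transform the fitted values and score them all in PPG units. The sketch below does this on simulated data with a single hypothetical predictor `x`. It is an illustration of the idea, not the analysis in this report, and the naive back-transformation ignores retransformation bias.

```r
set.seed(2)
n <- 300

# Simulated, right-skewed stand-in for PPG with one hypothetical predictor.
x   <- runif(n, 0, 2)
ppg <- exp(-1.5 + 0.8 * x + rnorm(n, sd = 0.4))
dat <- data.frame(x = x, ppg = ppg)

fit_raw  <- lm(ppg ~ x, data = dat)
fit_log  <- lm(log(1 + ppg) ~ x, data = dat)
fit_sqrt <- lm(sqrt(1 + ppg) ~ x, data = dat)

# Back-transform fitted values so every model is scored in PPG units.
pred <- list(
  raw  = fitted(fit_raw),
  log  = exp(fitted(fit_log)) - 1,
  sqrt = fitted(fit_sqrt)^2 - 1
)

# In-sample root mean squared error on the original PPG scale.
rmse <- sapply(pred, function(p) sqrt(mean((dat$ppg - p)^2)))
print(round(rmse, 4))
```

In-sample error is only one criterion. Residual plots for each fit, for example from `plot(fit_log)`, would still be needed to judge whether a transformation helps with non-constant variance or non-normal residuals.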

Possible Confounding Variables

Draft

Draft Pick Number with Games Played

Draft Status and Seasons Played

Draft Round and Games Played

Faceted by Season

Faceted by Round

Draft Status and Games Played

Goals and Assists

Plus-Minus

Highest Player

Lowest Player

Median Player

Note: Many teams had the same number of players with a plus-minus of -1.

Teams

Drafted vs. Undrafted

Points Per Game

Different Ways to Facet

Position

Season

Questions/Concerns

  1. How do I use BART in R?
  2. Are we using BART to get propensity scores?
  3. Can you explain CATE? What are good resources on CATE?
  4. Next steps for modeling?